127 research outputs found

    Toward an Ethical Framework for the Text Mining of Social Media for Health Research: A Systematic Review

    Background: Text-mining techniques are advancing all the time and vast corpora of social media text can be analyzed for users' views and experiences related to their health. There is great promise for new insights into health issues such as drug side effects and spread of disease, as well as patient experiences of health conditions and health care. However, this emerging field lacks ethical consensus and guidance. We aimed to bring together a comprehensive body of opinion, views, and recommendations in this area so that academic researchers new to the field can understand relevant ethical issues. Methods: After registration of a protocol in PROSPERO, three parallel systematic searches were conducted, to identify academic articles comprising commentaries, opinion, and recommendations on ethical practice in social media text mining for health research and gray literature guidelines and recommendations. These were integrated with social media users' views from qualitative studies. Papers and reports that met the inclusion criteria were analyzed thematically to identify key themes, and an overarching set of themes was deduced. Results: A total of 47 reports and articles were reviewed, and eight themes were identified. Commentators suggested that publicly posted social media data could be used without consent and formal research ethics approval, provided that the anonymity of users is ensured, although we note that privacy settings are difficult for users to navigate on some sites.
Even without the need for formal approvals, we note ethical issues: to actively identify and minimize possible harms, to conduct research for public benefit rather than private gain, to ensure transparency and quality of data access and analysis methods, and to abide by the law and terms and conditions of social media sites. Conclusion: Although social media text mining can often legally and reasonably proceed without formal ethics approvals, we recommend improving ethical standards in health-related research by increasing transparency of the purpose of research, data access, and analysis methods; consultation with social media users and target groups to identify and mitigate against potential harms that could arise; and ensuring the anonymity of social media users

    Population Data Science: The science of data about people

    Introduction Societal and individual benefits of data-intensive science are substantial but raise challenges of balancing individual privacy and public good, while building appropriate governance and socio-technical systems to support data-intensive science. We set out to define a new field of inquiry to move collective interests forward. Objectives and Approach Our objectives were: 1. To create a concise definition of the emerging field of Population Data Science; 2. To highlight the characteristics and challenges of Population Data Science; 3. To differentiate Population Data Science from existing fields of data science and informatics; and 4. To discuss the implications and future opportunities for Population Data Science. Objectives 1 and 2 were met largely through International Population Data Linkage Network (IPDLN) member engagement, Objective 3 was evaluated via literature review, and Objective 4 was achieved through iterative and collective work on a peer-reviewed position paper. Results We define Population Data Science succinctly as the science of data about people. It is related to, but distinct from, the fields of data science and informatics. A broader definition includes four characteristics of: i) data use for positive impact on individuals and populations; ii) bringing together and analyzing data from multiple sources; iii) identifying population-level insights; and iv) developing safe, privacy-sensitive and ethical infrastructure to support research. One implication of these characteristics is that few individuals or organisations possess all of the requisite knowledge and skills comprising Population Data Science, so this is by nature a multi-disciplinary “team science” field. There is a need to advance various aspects of science, such as data linkage technology, various forms of analytics, and methods of public engagement. 
Conclusion/Implications These implications are the beginnings of a research agenda for Population Data Science, which if approached as a collective field, will catalyze significant advances in our understanding of society, health, and human behavior and increase the impact of our research

    LINKAGE: Factors in selecting a data linkage approach


    A UKSeRP for SAIL: striking a balance

    ABSTRACT Objectives Whilst the current expansion of health-related big data and data linkage research are exciting developments with great potential, they bring a major challenge. This is how to strike an appropriate balance between making the data accessible for beneficial uses, whilst respecting the rights of individuals, the duty of confidentiality and protecting the privacy of person-level data, without undue burden to research. Approach Using a case study approach, we describe how the UK Secure Research Platform (UKSeRP) for the Secure Anonymised Information Linkage (SAIL) databank addresses this challenge. We outline the principles, features and operating model of the SAIL UKSeRP, and how we are addressing the challenges of making health-related data safely accessible to increasing numbers of research users within a secure environment. Results The SAIL UKSeRP has four basic principles to ensure that it is able to meet the needs of the growing data user community, and these are to: A) operate a remote access system that provides secure data access to approved data users; B) host an environment that provides a powerful platform for data analysis activities; C) have a robust mechanism for the safe transfer of approved files in and out of the system; and D) ensure that the system is efficient and scalable to accommodate a growing data user base. Subject to independent Information Governance approval and within a robust, proportionate Governance framework, the SAIL UKSeRP provides data users with a familiar Windows interface and their usual toolsets to access anonymously-linked datasets for research and evaluation. Conclusion The SAIL UKSeRP represents a powerful analytical environment within a privacy-protecting safe haven and secure remote access system which has been designed to be scalable and adaptable to meet the needs of the rapidly growing data linkage community.
Further challenges lie ahead as the landscape develops and emerging data types become more available. UKSeRP technology is available and customisable for other use cases within the UK and international jurisdictions, to operate within their respective governance frameworks

    Exploring the Use of Genomic and Routinely Collected Data: Narrative Literature Review and Interview Study

    Background: Advancing the use of genomic data with routinely collected health data holds great promise for health care and research. Increasing the use of these data is a high priority to understand and address the causes of disease. Objective: This study aims to provide an outline of the use of genomic data alongside routinely collected data in health research to date. As this field prepares to move forward, it is important to take stock of the current state of play in order to highlight new avenues for development, identify challenges, and ensure that adequate data governance models are in place for safe and socially acceptable progress. Methods: We conducted a literature review to draw information from past studies that have used genomic and routinely collected data and conducted interviews with individuals who use these data for health research. We collected data on the following: the rationale of using genomic data in conjunction with routinely collected data, types of genomic and routinely collected data used, data sources, project approvals, governance and access models, and challenges encountered. Results: The main purpose of using genomic and routinely collected data was to conduct genome-wide and phenome-wide association studies. Routine data sources included electronic health records, disease and death registries, health insurance systems, and deprivation indices. The types of genomic data included polygenic risk scores, single nucleotide polymorphisms, and measures of genetic activity, and biobanks generally provided these data. Although the literature search showed that biobanks released data to researchers, the case studies revealed a growing tendency for use within a data safe haven. Challenges of working with these data revolved around data collection, data storage, technical, and data privacy issues. Conclusions: Using genomic and routinely collected data holds great promise for progressing health research.
Several challenges are involved, particularly in terms of privacy. Overcoming these barriers will ensure that the use of these data to progress health research can be exploited to its full potential

    Advancing cross-centre research networks: learning from experience, looking to the future

    Introduction Many jurisdictions have programmes for the large-scale reuse of health and administrative data that would benefit from greater cross-centre working. The Advancing Cross centre Research Networks (ACoRN) project considered barriers and drivers for joint working and information sharing using the UK Farr Institute as a case study, and applicable widely. Objectives and Approach ACoRN collected information from researchers, analysts, academics and the public to gauge the acceptability of sharing data across institutions and jurisdictions. It considered international researcher experiences and evidence from a variety of cross centre projects to reveal barriers and potential solutions to joint working. It reviewed the legal and regulatory provisions that surround data sharing and cross-centre working, including issues of information governance to provide the context and backdrop. The emerging issues were grouped into five themes and used to propose a set of recommendations. Results The five themes identified were: organisational structures and legal entities; people and culture; information governance; technology and infrastructure; and finance and strategic planning. Recommendations within these included: standardised terms and conditions including agreements and contractual templates; performance indicators for frequency of dataset sharing; communities of practice and virtual teams to develop cooperation; standardised policies and procedures to underpin data sharing; an accredited quality seal for organisations sharing data; a dashboard for data availability and sharing; and adequate resource to move towards greater uniformity and to drive data sharing initiatives. Conclusion/Implications The challenges posed by cross-centre information sharing are considerable but the public benefits associated with the greater use of health and administrative data are inestimable, particularly as novel and emerging data become increasingly available. 
The proposed recommendations will assist in achieving the benefits of cross-centre working

    The Good, the Bad, the Clunky and . . . the Outcomes

    Background There are considerable challenges to be addressed so the benefits of administrative data for research can be realised. Significant headway is being made, but there is great scope and appetite for further improvement. Objectives This study set out to explore good practice, barriers and bottlenecks in effective administrative data use, and to gain suggestions on how to share the good, solve the bad and improve the clunky issues. Methods Using the ESRC-funded UK Administrative Data Research Network (ADRN) as the case study, a qualitative survey, focusing on the data use pathway, was carried out across the network. This encompassed a set of 18 questions spanning from acquisition to archiving. Survey responses were grouped into six themes: data acquisition; approval processes; controls on access and disclosure; data and metadata; researcher support; and data reuse and retention. The resulting information matrix was presented to participants at the All Hands meeting (April-May 2017) to facilitate discussion. Findings Survey responses were received from across the network (N=27) and 95 people took part in the workshop. The combined information from the survey and workshop was used to inform a set of 18 recommendations across the six themes, and this has been used by the ADRN directors to develop an action plan for implementation. Conclusions The ADRN has broken new ground in overcoming many challenges in using administrative data for research in the UK. The recommendations and action plan show how further improvements will be made in the ADRCs, and the findings of this study are relevant more widely to other organisations working with administrative data